185 research outputs found

    Asymptotics for high-dimensional covariance matrices and quadratic forms with applications to the trace functional and shrinkage

    We establish large sample approximations for an arbitrary number of bilinear forms of the sample variance-covariance matrix of a high-dimensional vector time series using $\ell_1$-bounded and small $\ell_2$-bounded weighting vectors. Estimation of the asymptotic covariance structure is also discussed. The results hold true without any constraint on the dimension, the number of forms and the sample size or their ratios. Concrete and potential applications are widespread and cover high-dimensional data science problems such as tests for large numbers of covariances, sparse portfolio optimization and projections onto sparse principal components or more general spanning sets as frequently considered, e.g. in classification and dictionary learning. As two specific applications of our results, we study in greater detail the asymptotics of the trace functional and shrinkage estimation of covariance matrices. In shrinkage estimation, it turns out that the asymptotics differ for weighting vectors bounded away from orthogonality and nearly orthogonal ones in the sense that their inner product converges to 0. Comment: 42 pages
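As a rough illustration of the objects named in this abstract, the sketch below computes a bilinear form v'Sw of a sample covariance matrix S and recovers the trace as a sum of such forms over unit vectors. The function names, the divisor n - 1 and the toy data are assumptions made for illustration only; nothing here reproduces the paper's asymptotic results.

```python
def sample_covariance(X):
    """Sample covariance of n observations (rows of X), each of dimension d.

    Assumption: observations are centred at the sample mean and the
    divisor is n - 1; other conventions (divisor n, known mean) exist.
    """
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    centred = [[row[j] - means[j] for j in range(d)] for row in X]
    return [
        [sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def bilinear_form(v, S, w):
    """Return v' S w for weighting vectors v, w and a square matrix S."""
    d = len(S)
    return sum(v[i] * S[i][j] * w[j] for i in range(d) for j in range(d))


def trace_via_bilinear_forms(S):
    """Trace of S written as the sum of e_i' S e_i over unit vectors e_i."""
    d = len(S)
    units = [[1.0 if k == i else 0.0 for k in range(d)] for i in range(d)]
    return sum(bilinear_form(e, S, e) for e in units)


# Hypothetical toy data: four observations of a two-dimensional series.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
S = sample_covariance(X)
cross_covariance = bilinear_form([1.0, 0.0], S, [0.0, 1.0])  # picks out S[0][1]
total_variance = trace_via_bilinear_forms(S)
```

Choosing unit vectors as weights singles out individual covariances, which is why a result covering many bilinear forms at once also covers the trace.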

    Time-frequency analysis of locally stationary Hawkes processes

    Locally stationary Hawkes processes have been introduced in order to generalise classical Hawkes processes away from stationarity by allowing for a time-varying second-order structure. This class of self-exciting point processes has recently attracted a lot of interest in applications in the life sciences (seismology, genomics, neuroscience, ...), but also in the modelling of high-frequency financial data. In this contribution we provide a fully developed nonparametric estimation theory of both local mean density and local Bartlett spectra of a locally stationary Hawkes process. In particular, we apply our kernel estimation of the spectrum localised both in time and frequency to two data sets of transaction times, revealing pertinent features in the data that had not been made visible by classical non-localised approaches based on models with constant fertility functions over time. Comment: Bernoulli journal, to appear

    Locally stationary long memory estimation

    There exists a wide literature on modelling strongly dependent time series using a long-memory parameter d, including more recent work on semiparametric wavelet estimation. As a generalization of these latter approaches, in this work we allow the long-memory parameter d to vary over time. We embed our approach into the framework of locally stationary processes. We show weak consistency and a central limit theorem for our log-regression wavelet estimator of the time-dependent d in a Gaussian context. Both simulations and a real data example complete our work on providing a fairly general approach.

    A Multiscale Approach for Statistical Characterization of Functional Images

    Increasingly, scientific studies yield functional image data, in which the observed data consist of sets of curves recorded on the pixels of the image. Examples include temporal brain response intensities measured by fMRI and NMR frequency spectra measured at each pixel. This article presents a new methodology for improving the characterization of pixels in functional imaging, formulated as a spatial curve clustering problem. Our method operates on curves as a unit. It is nonparametric and involves multiple stages: (i) wavelet thresholding, aggregation, and Neyman truncation to effectively reduce dimensionality; (ii) clustering based on an extended EM algorithm; and (iii) multiscale penalized dyadic partitioning to create a spatial segmentation. We motivate the different stages with theoretical considerations and arguments, and illustrate the overall procedure on simulated and real datasets. Our method appears to offer substantial improvements over monoscale pixel-wise methods. An appendix giving some theoretical justifications of the methodology, together with computer code, documentation and the dataset, is available in the online supplements.

    Intrinsic data depth for Hermitian positive definite matrices

    Nondegenerate covariance, correlation and spectral density matrices are necessarily symmetric or Hermitian and positive definite. The main contribution of this paper is the development of statistical data depths for collections of Hermitian positive definite matrices by exploiting the geometric structure of the space as a Riemannian manifold. The depth functions allow one to naturally characterize most central or outlying matrices, but also provide a practical framework for inference in the context of samples of positive definite matrices. First, the desired properties of an intrinsic data depth function acting on the space of Hermitian positive definite matrices are presented. Second, we propose two computationally fast pointwise and integrated data depth functions that satisfy each of these requirements and investigate several robustness and efficiency aspects. As an application, we construct depth-based confidence regions for the intrinsic mean of a sample of positive definite matrices, which is applied to the exploratory analysis of a collection of covariance matrices associated with a multicenter research trial.

    Fitting dynamic factor models to non-stationary time series

    Factor modelling of a large time series panel has widely proven useful to reduce its cross-sectional dimensionality. This is done by explaining common co-movements in the panel through the existence of a small number of common components, up to some idiosyncratic behaviour of each individual series. To capture serial correlation in the common components, a dynamic structure is used as in traditional (uni- or multivariate) time series analysis of second order structure, i.e. allowing for infinite-length filtering of the factors via dynamic loadings. In this paper, motivated by economic data observed over long time periods which show smooth transitions over time in their covariance structure, we allow the dynamic structure of the factor model to be non-stationary over time, by proposing a deterministic time variation of its loadings. In this respect we generalise existing recent work on static factor models with time-varying loadings as well as the classical, i.e. stationary, dynamic approximate factor model. Motivated by the stationary case, we estimate the common components of our dynamic factor model by the eigenvectors of a consistent estimator of the now time-varying spectral density matrix of the underlying data-generating process. This can be seen as a time-varying principal components approach in the frequency domain. We derive consistency of this estimator in a "double-asymptotic" framework of both cross-section and time dimension tending to infinity. A simulation study illustrates the performance of our estimators. Keywords: econometrics

    Multivariate wavelet-based shape preserving estimation for dependent observations

    We present a new approach to shape preserving estimation of probability distribution and density functions using wavelet methodology for multivariate dependent data. Our estimators preserve shape constraints such as monotonicity, positivity and integration to one, and allow for low spatial regularity of the underlying functions. As an important application, we discuss conditional quantile estimation for financial time series data. We show, through Monte Carlo simulations, that our methodology can be easily implemented with B-splines and performs well in a finite sample situation. Keywords: conditional quantile; time series; shape preserving wavelet estimation; B-splines; multivariate process

    Structural shrinkage of nonparametric spectral estimators for multivariate time series

    In this paper we investigate the performance of periodogram-based estimators of the spectral density matrix of possibly high-dimensional time series. We suggest and study shrinkage as a remedy against numerical instabilities due to deteriorating condition numbers of (kernel) smoothed periodogram matrices. Moreover, shrinking the empirical eigenvalues in the frequency domain towards one another also improves the Mean Squared Error (MSE) of these widely used nonparametric spectral estimators. Compared to some existing time domain approaches, restricted to i.i.d. data, in the frequency domain it is necessary to take the size of the smoothing span as "effective or local sample size" into account. While Böhm and von Sachs (2007) propose a multiple of the identity matrix as optimal shrinkage target in the absence of knowledge about the multidimensional structure of the data, here we consider "structural" shrinkage. We assume that the spectral structure of the data is induced by underlying factors. However, in contrast to actual factor modelling, which suffers from the need to choose the number of factors, we suggest a model-free approach. Our final estimator is the asymptotically MSE-optimal linear combination of the smoothed periodogram and the parametric estimator based on an underfitting (and hence deliberately misspecified) factor model. We complete our theoretical considerations with some extensive simulation studies. In the situation of data generated from a higher-order factor model, we compare all four types of involved estimators (including the one of Böhm and von Sachs (2007)). Comment: Published in the Electronic Journal of Statistics (http://www.i-journals.org/ejs/) at http://dx.doi.org/10.1214/08-EJS236 by the Institute of Mathematical Statistics (http://www.imstat.org)
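The simplest shrinkage target mentioned in this abstract, a multiple of the identity matrix, can be sketched as follows. This is a minimal illustration under loud assumptions: a real symmetric 2x2 matrix stands in for a Hermitian smoothed periodogram matrix at a single frequency, the weight 0.2 is an arbitrary hypothetical value and not the MSE-optimal weight derived in the paper, and all function names are invented here.

```python
import math


def shrink_towards_identity(S, weight):
    """Return (1 - weight) * S + weight * mu * I, with mu = trace(S) / p.

    Assumption: `weight` in [0, 1] is supplied by the caller; the paper's
    data-driven, MSE-optimal choice is not reproduced here.
    """
    p = len(S)
    mu = sum(S[i][i] for i in range(p)) / p
    return [
        [(1 - weight) * S[i][j] + (weight * mu if i == j else 0.0)
         for j in range(p)]
        for i in range(p)
    ]


def eigenvalues_sym_2x2(S):
    """Closed-form eigenvalues (smallest, largest) of a symmetric 2x2 matrix."""
    a, b, d = S[0][0], S[0][1], S[1][1]
    centre = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return centre - radius, centre + radius


def condition_number(S):
    """Ratio of largest to smallest eigenvalue (positive definite S assumed)."""
    smallest, largest = eigenvalues_sym_2x2(S)
    return largest / smallest


# Hypothetical, nearly singular input: two strongly correlated components.
S = [[1.0, 0.99], [0.99, 1.0]]
S_shrunk = shrink_towards_identity(S, weight=0.2)
cond_before = condition_number(S)
cond_after = condition_number(S_shrunk)
```

Each eigenvalue moves from lambda to (1 - weight) * lambda + weight * mu, so the eigenvalues are pulled towards their common mean while the trace is unchanged, which is what lowers the condition number.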